Discovering Causality in Large Databases

Authors

  • Shichao Zhang
  • Chengqi Zhang
Abstract

A causal rule between two variables, X → Y, captures the relationship that the presence of X causes the appearance of Y. Because of its usefulness (compared to association rules), techniques for mining causal rules are beginning to be developed. However, existing methods (such as the LCD and CU-path algorithms) are effective only for mining causal rules among simple variables, and are inadequate to discover and represent causal rules among multi-value variables. In this paper, we propose that the causality between variables X and Y be represented in the form X → Y with conditional probability matrix M_{Y|X}. We also propose a new approach to discover causality in large databases based on partitioning. The approach partitions the items into item variables by decomposing "bad" item variables and composing "not-good" item variables. In particular, we establish a method to optimize causal rules that merges the "useless" information in conditional probability matrices of extracted causal rules.
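To make the conditional probability matrix concrete, the following is a minimal sketch of how such a matrix could be estimated from co-occurrence counts for two multi-value variables. It is not the authors' partitioning algorithm; the function name `conditional_probability_matrix`, the variable values, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def conditional_probability_matrix(records, x_values, y_values):
    """Estimate M[i][j] = P(Y = y_values[j] | X = x_values[i]) from (x, y) pairs."""
    pair_counts = Counter(records)             # how often each (x, y) pair occurs
    x_counts = Counter(x for x, _ in records)  # how often each x value occurs
    matrix = []
    for x in x_values:
        total = x_counts[x]
        # A row of zeros is used when x never appears (an assumption of this sketch).
        row = [pair_counts[(x, y)] / total if total else 0.0 for y in y_values]
        matrix.append(row)
    return matrix


# Hypothetical toy data: each record is an (X value, Y value) pair.
records = [
    ("low", "small"), ("low", "small"), ("low", "large"), ("low", "small"),
    ("high", "large"), ("high", "large"), ("high", "small"), ("high", "large"),
]
M = conditional_probability_matrix(records, ["low", "high"], ["small", "large"])
for x, row in zip(["low", "high"], M):
    print(x, row)
```

Each row of the matrix is a probability distribution over the values of Y given one value of X, which is what lets a single rule X → Y describe multi-value variables rather than only presence or absence.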


Similar resources

Discovering Knowledge from Medical Databases

We investigate new approaches for knowledge discovery from two medical databases. Two different kinds of knowledge, namely rules and causal structures, are learned. Rules capture interesting patterns and regularities in the database. Causal structures represented by Bayesian networks capture the causality relationships among the attributes. We employ advanced evolutionary algorithms for these d...


Applying Evolutionary Algorithms to Discover Knowledge from Medical Databases

Data mining has become an important research topic. The increasing use of computer results in an explosion of information. These data can be best used if the knowledge hidden can be uncovered. Thus there is a need for a way to automatically discover knowledge from data. In this paper, new approaches for knowledge discovery from two medical databases are investigated. Two different kinds of know...


Discovering Topical Structures of Databases

In today’s enterprise world, the scale of the databases and the increasing complexity of these databases and the prevalent lack of documentation make it hard for a data architect to understand, reverse engineer and integrate the databases. In this paper, the problem of discovering topical structures of databases to support semantic browsing and large scale data integration is addressed. The iDi...


A Proposed Data Mining Methodology and its Application to Industrial Procedures

Data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data stored in repositories, corporate databases, and data warehouses. Industrial procedures with the help of engineers, managers, and other specialists, comprise a broad field and have many tools and techniques in their problem-solving arsenal. The purpose of this st...


Discovering Multi-head Attributional Rules in Large Databases

A method for discovering multi-head attributional rules in large databases is presented and illustrated by results from an implemented program. Attributional rules (a.k.a. attributional dependencies) can be viewed as generalizations of standard association rules, because they use more general and expressive conditions than those in the latter ones, and by that can express more concisely inter-a...



Journal:
  • Applied Artificial Intelligence

Volume 16


Publication date: 2002